Abstract

In this paper, we perform statistical segmentation and clustering analysis of the
Dow Jones Industrial Average time series between January 1997 and August 2008.
Modeling the index movements and log-index movements as stationary Gaussian
processes, we find a total of 116 and 119 statistically stationary segments
respectively. These can then be grouped into between five and seven clusters,
each representing a different macroeconomic phase. The macroeconomic phases are
distinguished primarily by their volatilities. We find the US economy, as measured
by the DJI, spends most of its time in a low-volatility phase and a high-volatility
phase. The former can be roughly associated with economic expansion, while
the latter contains the economic contraction phase in the standard economic
cycle. Both phases are interrupted by a moderate-volatility market, but
extremely-high-volatility market crashes are found mostly within the high-volatility
phase. From the temporal distribution of various phases, we see a high-volatility
phase from mid-1998 to mid-2003, and another starting mid-2007 (the current global
financial crisis). Transitions from the low-volatility phase to the high-volatility
phase are preceded by a series of precursor shocks, whereas the transition from
the high-volatility phase to the low-volatility phase is preceded by a series of
inverted shocks. The time scale for both types of transitions is about a year. We also
identify the July 1997 Asian Financial Crisis to be the trigger for the mid-1998
transition, and an unnamed May 2006 market event related to corrections in the
Chinese markets to be the trigger for the mid-2007 transition.
1. Introduction

Most people remember the most recent economic recession as short (lasting 
only eight months from March 2001 to November 2001 [1]) and mild (affect- 
ing mostly high-tech companies). Against this backdrop, there have been many 
sensationalist claims that the current global financial crisis is the deepest (broad 
spectrum of economic sectors affected) and longest (peak in December 2007 [1],
and a potential trough in March 2009). According to other sources (see, for ex- 
ample, Ref. [2]), however, the Subprime Crisis surfaced around July 2007 with a 
slew of bad news from subprime lenders, and the Dow Jones Industrial Average 
(DJI) dipping roughly 1,000 points going from July 2007 to August 2007. Since 
then, billions of dollars have been sunk into relief and stimulus packages, and gov- 
ernments around the world are planning further aid totalling in excess of a trillion 
US dollars. There are hardly any positive results to show for the effort thus far, 
and the reasons can best be summed up as "too little, too late". In medicine, early 
intervention is generally more effective and less costly compared to a late cure. 
The same is probably true for economies and financial markets. Clearly, even if 
we are not sure what kind of intervention measures will be effective, acting early 
is still more desirable than acting later. To accomplish this, it is important to be able
to unambiguously detect the onset of a financial crisis, so that we can at the same 
time avoid over-reacting when the market has merely caught a 'cold'. 

Since econometric data such as the gross national product (GNP) are released
quarterly, and are adjusted monthly, they are not useful for timely detection. We
thus look to higher-frequency financial time series for this sleuth work. Given that
each financial crisis may have its own unique and esoteric character, we need a
financial time series that is sufficiently representative of the broad spectrum
of industries to be able to detect the starting point of these crises. Indices
such as the Dow Jones Industrial Average (DJI), Dow Jones Composite Average
(DJA), and the Standard & Poor's 500 (INX) are most suitable for this purpose.
Clearly, detecting the onset of a financial crisis is a change point problem [3, 4].
In their seminal works, Goldfeld et al, Hamilton and Kim et al fitted a
Markov-switching model to local trends in the US GNP time series to detect transitions
between a macroeconomic phase (or regime) with high growth rate and a
macroeconomic phase with low growth rate [5, 6, 7]. Unlike econometric time series,
which evolve fairly slowly with time, financial time series are well known to
exhibit dynamics on multiple time scales. To avoid potential complications
arising from such multiscale dynamics, we analyze statistical fluctuations in the index
time series, instead of looking merely at the local trend, as is done for deciding
the duration of an economic recession. For the different macroeconomic phases
the economy and financial market can be found in, these statistical fluctuations
should also be qualitatively different.

In this paper, we describe in Section 2 a model-based approach to statistically
segmenting the DJI time series, which is assumed to consist of a large number of
statistically stationary segments. Within different segments, the index movements
(or log-index movements) are assumed to follow stationary Gaussian processes
with different means and variances. We then discover these segments using a
recursive segmentation scheme based on the relative entropy between them.
Following this, we determine the small number of macroeconomic phases represented
in the time series by performing agglomerative hierarchical clustering on the
segments. In Section 3, we report findings from our statistical segmentation and
clustering analyses. Segments obtained using the two models are in good
agreement with each other, and also with the dates of major market events, suggesting
that the segment boundaries discovered are robust and meaningful. Depending on
the model, and the level of granularity we choose, we find between six and seven
macroeconomic phases after clustering the segments. These six to seven
macroeconomic phases are distinguished primarily by their variances, which represent
market volatilities. While the clusters appear to be less robust compared to the
segments, their temporal distributions do tell a fairly consistent story: the US
market, as measured by the DJI, is found predominantly in a low-volatility phase
and a high-volatility phase, corresponding roughly to economic expansion and
economic contraction respectively. Both phases are interrupted by a
moderate-volatility market correction phase, while the high-volatility phase is also
interrupted by an extremely-high-volatility market crash phase. More interestingly, our
results suggest that the mid-1998 transition into the high-volatility phase (which
lasted five years) was triggered by the 1997 Asian Financial Crisis, whereas the
mid-2007 transition into the high-volatility phase (the global financial crisis we
find ourselves in right now) was triggered by a 2006 correction in the Chinese
markets. As we have guessed, the world is very tightly coupled economically,
perhaps even more so than we would like to admit. We then conclude in Section
4 and describe further work we are currently undertaking.






2. Data, Models and Methods 

2.1. Data and Models 

While it is not as comprehensive as the S&P 500, the Dow Jones Industrial
Average (a price-weighted index consisting of 30 of the largest and most widely
held public companies in the US) is nonetheless a very important index measuring the
performance of the US market. Tick-by-tick data for this index between 1 January
1997 and 31 August 2008 was downloaded from the Taqtic database [8], and
processed to give a half-hourly time series $X = (X_1, X_2, \dots, X_N)$, where $X_t$ is
the index value at the $t$th half-hour, and $N$ is the total number of trading
half-hours between 1 January 1997 and 31 August 2008. The half-hourly frequency was
chosen so that there are sufficient statistics to identify segments as short as a single
day. From the index time series $X$, we obtain the index movement time series
$x = (x_1, \dots, x_{N-1})$, where $x_t = X_t - X_{t-1}$, as well as the log-index movement time
series $y = (y_1, \dots, y_{N-1})$, where $y_t = \log X_t - \log X_{t-1}$. We assume that $x$ and
$y$ consist of $M$ and $M'$ statistically stationary segments respectively, where the
numbers of segments $M$ and $M'$, and where the segments are, are unknown and
must be determined through a segmentation procedure.

To do this segmentation, we assume the movements $x_t$ within statistically
stationary segment $m$ are drawn from a Gaussian (normal) distribution with mean
$\mu_m$ and variance $\sigma_m^2$. Similarly, the movements $y_t$ within statistically stationary
segment $m'$ are assumed to be drawn from a Gaussian distribution with mean $\mu'_{m'}$
and variance $\sigma'^2_{m'}$. The log-normal index movement model is popular in the
finance literature, where traders are assumed to be influenced mainly by percentage
changes rather than absolute changes, because of their constant mental reference
to a risk-free interest rate. In this study, we also consider the normal index
movement model, in case traders in the real world also pay attention to actual changes
in the index. In both models, the movements from one half-hour to the next are
uncorrelated, in contrast to real-world financial time series, which are known to
exhibit correlations on multiple time scales. For the purpose of finding statistically
robust change points in the time series, we believe that the details of the models
used will not be important, and the difference between an uncorrelated model and
a correlated model will merely be a difference in statistical significance
and signal-to-noise ratio.
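
As a minimal sketch of the two movement series defined above, the following Python snippet computes the index movements and the log-index movements from a list of half-hourly index values. The function names and the sample values are hypothetical and chosen only for illustration; they are not taken from the data set used in this paper.

```python
import math

def index_movements(X):
    """Index movements x_t = X_t - X_{t-1} between successive half-hours."""
    return [X[t] - X[t - 1] for t in range(1, len(X))]

def log_index_movements(X):
    """Log-index movements y_t = log X_t - log X_{t-1}."""
    return [math.log(X[t]) - math.log(X[t - 1]) for t in range(1, len(X))]

# Hypothetical half-hourly index values (illustration only, not real DJI data).
X = [8000.0, 8010.0, 7990.0, 8020.0]
x = index_movements(X)
y = log_index_movements(X)
print(x)  # absolute changes in index points
print(y)  # approximately the fractional changes
```

For small changes the log-index movement is close to the fractional change in the index, which is why the two models weight the same event differently at different index levels.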

2.2. Methods 

Time series segmentation schemes can be very broadly classified into those
based on pattern recognition, and those based on information-theoretic measures.
In pattern-based segmentation schemes, features within the time series are
abstracted into symbols, as is frequently done in the technical analysis of stock
markets [9]. Segmentation decisions are then based on the relative abundance of
symbols, or their context trees [10, 11, 12, 13]. Information-theoretic segmentation
methods are popular in image segmentation [14], biological sequence
segmentation [15], and also in medical time series analysis [16], but not widely used for
financial time series segmentation [17, 18].

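To determine the location of the $M$ segments, we employ the recursive
segmentation scheme introduced by Bernaola-Galvan et al [19, 20] for biological
sequence segmentation. In this scheme, we first identify a cursor position $t$ in
the sequence $z = (z_1, z_2, \dots, z_N)$ with length $N$, and compute the Jensen-Shannon
divergence

$$\Delta_t = \log \frac{P_2(t)}{P_1}, \tag{1}$$

which measures the statistical divergence between the left subsequence
$z_L = (z_1, z_2, \dots, z_t)$ and the right subsequence $z_R = (z_{t+1}, z_{t+2}, \dots, z_N)$. Here,

$$P_1 = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(z_i - \mu)^2}{2\sigma^2}\right] \tag{2}$$

is the likelihood for observing the sequence $z$, assuming that the entire sequence
is generated by a single Gaussian process with mean $\mu$ and variance $\sigma^2$, and

$$P_2(t) = \prod_{i=1}^{t} \frac{1}{\sqrt{2\pi\sigma_L^2}} \exp\left[-\frac{(z_i - \mu_L)^2}{2\sigma_L^2}\right] \prod_{j=t+1}^{N} \frac{1}{\sqrt{2\pi\sigma_R^2}} \exp\left[-\frac{(z_j - \mu_R)^2}{2\sigma_R^2}\right] \tag{3}$$

is the likelihood for observing the sequence $z$, assuming that the left subsequence
$z_L = (z_1, z_2, \dots, z_t)$ is generated by a Gaussian process with mean $\mu_L$ and variance
$\sigma_L^2$, and the right subsequence $z_R = (z_{t+1}, z_{t+2}, \dots, z_N)$ is generated by a Gaussian
process with mean $\mu_R$ and variance $\sigma_R^2$.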

Since the parameters $\mu$, $\mu_L$, $\mu_R$, $\sigma^2$, $\sigma_L^2$, and $\sigma_R^2$ are not given, we can replace
them with their maximum-likelihood estimates $\hat{\mu}$, $\hat{\mu}_L$, $\hat{\mu}_R$, $\hat{\sigma}^2$, $\hat{\sigma}_L^2$, and $\hat{\sigma}_R^2$. These
estimates maximize $P_1$ and $P_2(t)$ relative to the data, and the Jensen-Shannon
divergence, which simplifies to

$$\Delta_t = N \log \hat{\sigma} - n_L \log \hat{\sigma}_L - n_R \log \hat{\sigma}_R + \frac{1}{2} \geq 0, \tag{4}$$

tells us how much better the best two-segment model fits the observed data over
the best one-segment model. Here $n_L = t$ and $n_R = N - t$ are the lengths of the left
and right subsequences. If we now vary $t$, and identify $t = t^*$ for which
$\Delta_{t^*} = \Delta^* = \max_t \Delta_t$, this would tell us the best place to segment the given sequence
$z = (z_1, z_2, \dots, z_N)$. The Jensen-Shannon divergence maximum $\Delta^*$ gives us an
indication of how significant the segment boundary at $t^*$ is statistically.
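
The search for the best cut can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used for the results in this paper: it evaluates only the three logarithmic terms $N \log \hat{\sigma} - n_L \log \hat{\sigma}_L - n_R \log \hat{\sigma}_R$ and leaves out any additive constant, which does not change where the maximum lies. The names `ml_sigma`, `divergence_spectrum`, `best_cut` and the parameter `min_len` are assumptions made for this example, and the test sequence is synthetic.

```python
import math

def ml_sigma(z):
    """Maximum-likelihood standard deviation (divides by n, not n - 1)."""
    n = len(z)
    mean = sum(z) / n
    return math.sqrt(sum((v - mean) ** 2 for v in z) / n)

def divergence_spectrum(z, min_len=2):
    """Map each cursor position t to Delta_t, with z[:t] left and z[t:] right."""
    n = len(z)
    sigma = ml_sigma(z)
    spectrum = {}
    if sigma == 0.0:
        return spectrum
    for t in range(min_len, n - min_len + 1):
        s_left, s_right = ml_sigma(z[:t]), ml_sigma(z[t:])
        if s_left == 0.0 or s_right == 0.0:
            continue  # the logarithm is undefined for a zero-variance piece
        spectrum[t] = (n * math.log(sigma)
                       - t * math.log(s_left)
                       - (n - t) * math.log(s_right))
    return spectrum

def best_cut(z, min_len=2):
    """Return (t*, Delta*), or (None, 0.0) when no cut can be evaluated."""
    spectrum = divergence_spectrum(z, min_len)
    if not spectrum:
        return None, 0.0
    t_star = max(spectrum, key=spectrum.get)
    return t_star, spectrum[t_star]

# Synthetic sequence: 50 low-volatility points followed by 50 high-volatility points.
z = [(-1) ** i * 1.0 for i in range(50)] + [(-1) ** i * 5.0 for i in range(50)]
t_star, delta_star = best_cut(z)
print(t_star, delta_star)
```

Recomputing the standard deviations from scratch at every cursor position costs quadratic time in the sequence length; running sums would reduce this to linear time, at the cost of a slightly longer listing.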

We then repeat this one-into-two segmentation procedure to recursively cut the
given sequence up into shorter and shorter segments. As this recursive
segmentation progresses, the divergence maxima for the new cuts will generally become
smaller and smaller. At some point, new cuts will no longer be statistically
significant, and the segmentation process must be terminated. There are several ways
to do this: through hypothesis testing [19, 20], through model selection [21, 22],
or through examination of the intrinsic statistical fluctuations within the sequence
to be segmented [23]. In this work, we adopted a semi-automated approach to
terminate the recursive segmentation. First, we recursively segment the time series
until the divergence maxima of the new cuts fall below a given threshold, selected
by inspection to be $\Delta_0 = 10$. We then screen these segments manually, by visually
inspecting the Jensen-Shannon divergence spectrum $\Delta_t$, to decide whether very
short segments should be eliminated, and very long segments should be further
segmented.
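
The automated part of this termination rule can be sketched as a short recursion. The Python sketch below is illustrative only: it stops cutting once the divergence maximum falls below the threshold of 10 quoted above, it does not reproduce the manual screening step, and it omits any additive constant in the divergence. The function names, the `min_len` parameter and the synthetic test sequence are assumptions made for this example.

```python
import math

def ml_sigma(z):
    """Maximum-likelihood standard deviation (divides by n)."""
    mean = sum(z) / len(z)
    return math.sqrt(sum((v - mean) ** 2 for v in z) / len(z))

def best_cut(z, min_len=2):
    """Return (t*, Delta*) for the best single cut, or (None, 0.0) if none exists."""
    n = len(z)
    best_t, best_delta = None, 0.0
    if n < 2 * min_len or ml_sigma(z) == 0.0:
        return best_t, best_delta
    total = n * math.log(ml_sigma(z))
    for t in range(min_len, n - min_len + 1):
        s_left, s_right = ml_sigma(z[:t]), ml_sigma(z[t:])
        if s_left == 0.0 or s_right == 0.0:
            continue
        delta = total - t * math.log(s_left) - (n - t) * math.log(s_right)
        if best_t is None or delta > best_delta:
            best_t, best_delta = t, delta
    return best_t, best_delta

def recursive_segmentation(z, threshold=10.0, min_len=2, offset=0):
    """Return the sorted cut positions, as indices into the full sequence."""
    t_star, delta_star = best_cut(z, min_len)
    if t_star is None or delta_star < threshold:
        return []  # no statistically significant cut: stop recursing here
    left = recursive_segmentation(z[:t_star], threshold, min_len, offset)
    right = recursive_segmentation(z[t_star:], threshold, min_len, offset + t_star)
    return left + [offset + t_star] + right

def regime(amplitude, length):
    """Synthetic alternating sequence with a fixed standard deviation."""
    return [(-1) ** i * amplitude for i in range(length)]

# Three synthetic regimes: low, high, then low volatility.
z = regime(1.0, 60) + regime(5.0, 40) + regime(1.0, 50)
cuts = recursive_segmentation(z)
print(cuts)
```

A cut at position `t` here means that one segment ends with element `t - 1` and the next begins with element `t`.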

At each stage of the recursive segmentation, we also perform segmentation
optimization, to overcome the context sensitivity problem identified in Ref. [24].
For this, we use the algorithm described in Ref. [23], where we start with $M$
segment boundaries $\{t_1, \dots, t_M\}$ obtained after new cuts have been introduced by the
recursive segmentation. To optimize the position of the $m$th segment boundary,
we compute the Jensen-Shannon divergence spectrum $\Delta_t$ within the supersegment
$(x_{t_{m-1}+1}, \dots, x_{t_{m+1}})$ bounded by the segment boundaries $t_{m-1}$ and $t_{m+1}$, and replace
$t_m$ by $t^*$, where the supersegment Jensen-Shannon divergence is maximized. This
is done for all $M$ segment boundaries, and iterated until all segment boundaries
converge to their optimal positions. We then continue the recursive segmentation
with this optimized set of segments, introducing new cuts and optimizing the new
segment boundaries along with the old segment boundaries, until the segmentation is
terminated.
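
A minimal Python sketch of this boundary optimization is given below, assuming each boundary is re-positioned at the divergence maximum of the supersegment formed by its two neighbouring segments. It is a sketch of the idea only, not the implementation of Ref. [23]: the names `optimize_boundaries`, `min_len` and `max_iter`, the iteration cap, and the deliberately misplaced starting boundaries are all assumptions made for this example.

```python
import math

def ml_sigma(z):
    """Maximum-likelihood standard deviation (divides by n)."""
    mean = sum(z) / len(z)
    return math.sqrt(sum((v - mean) ** 2 for v in z) / len(z))

def best_cut(z, min_len=2):
    """Return (t*, Delta*) for the best single cut, or (None, 0.0) if none exists."""
    n = len(z)
    best_t, best_delta = None, 0.0
    if n < 2 * min_len or ml_sigma(z) == 0.0:
        return best_t, best_delta
    total = n * math.log(ml_sigma(z))
    for t in range(min_len, n - min_len + 1):
        s_left, s_right = ml_sigma(z[:t]), ml_sigma(z[t:])
        if s_left == 0.0 or s_right == 0.0:
            continue
        delta = total - t * math.log(s_left) - (n - t) * math.log(s_right)
        if best_t is None or delta > best_delta:
            best_t, best_delta = t, delta
    return best_t, best_delta

def optimize_boundaries(z, cuts, min_len=2, max_iter=100):
    """Move each boundary to the divergence maximum of its supersegment."""
    cuts = sorted(cuts)
    for _ in range(max_iter):
        changed = False
        for m in range(len(cuts)):
            lo = cuts[m - 1] if m > 0 else 0                # previous boundary
            hi = cuts[m + 1] if m + 1 < len(cuts) else len(z)  # next boundary
            t_star, delta_star = best_cut(z[lo:hi], min_len)
            if t_star is not None and lo + t_star != cuts[m]:
                cuts[m] = lo + t_star
                changed = True
        if not changed:
            break  # every boundary already sits at its local optimum
    return cuts

def regime(amplitude, length):
    """Synthetic alternating sequence with a fixed standard deviation."""
    return [(-1) ** i * amplitude for i in range(length)]

z = regime(1.0, 60) + regime(5.0, 40) + regime(1.0, 50)
optimized = optimize_boundaries(z, [55, 108])  # start from misplaced boundaries
print(optimized)
```

Because each boundary is updated using the current positions of its neighbours, a single pass is generally not enough, which is why the loop repeats until no boundary moves.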

Finally, after we are satisfied that the final segmentation is optimal, and the
segment boundaries are all statistically significant, we perform agglomerative
hierarchical clustering on the segments to determine the number of macroeconomic
phases represented in the time series. This is done with the complete link
algorithm [25], using the Jensen-Shannon divergences between segments as their
statistical distances. Clustering of different periods within a financial time series
has been previously investigated [26, 27, 28], but we believe we are the first to
incorporate a rigorous segmentation analysis into such a study.
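
The clustering step can be illustrated with a small pure-Python sketch of complete-link agglomerative clustering. One assumption is made loudly here: the distance between two segments is taken to be the divergence obtained by concatenating them and cutting at the junction, with any additive constant omitted; the text above does not spell out this detail. The function names, the target number of clusters and the synthetic segments are likewise hypothetical.

```python
import math

def ml_sigma(z):
    """Maximum-likelihood standard deviation (divides by n)."""
    mean = sum(z) / len(z)
    return math.sqrt(sum((v - mean) ** 2 for v in z) / len(z))

def segment_distance(a, b):
    """Assumed distance: divergence of the concatenation a + b cut at the junction."""
    pooled = a + b
    return (len(pooled) * math.log(ml_sigma(pooled))
            - len(a) * math.log(ml_sigma(a))
            - len(b) * math.log(ml_sigma(b)))

def complete_link(segments, n_clusters):
    """Merge clusters of segment indices until n_clusters remain."""
    n = len(segments)
    dist = {(i, j): segment_distance(segments[i], segments[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(c1, c2):
        # Complete link: the distance between two clusters is the largest
        # pairwise distance between their members.
        return max(dist[min(i, j), max(i, j)] for i in c1 for j in c2)

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

def regime(amplitude, length):
    """Synthetic alternating sequence with a fixed standard deviation."""
    return [(-1) ** i * amplitude for i in range(length)]

# Two low-volatility and two high-volatility synthetic segments.
segments = [regime(1.0, 20), regime(5.0, 20), regime(1.0, 30), regime(5.0, 30)]
clusters = complete_link(segments, n_clusters=2)
print(clusters)
```

Stopping at a fixed number of clusters is a simplification; reading the merge distances off the full clustering tree, as done in this paper, lets the level of granularity be chosen afterwards.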



3. Results and Discussions 

3.1. Statistical Segmentation

From the DJI time series between January 1997 and August 2008, we found
a total of 116 segments using the normal index movement model, and a total of
119 segments for the log-normal index movement model. Most of the optimized
segment boundaries found are either mid-days or end-of-days, in agreement with
the start-of-day and end-of-day buzz, and mid-day lull observed in practically all
financial markets [29, 30]. We say that a segment boundary is common between
the two sets if its positions in the two models differ by at most one day. A total
of 85 common segment boundaries are found, out of which 37 are at the same
exact half-hour. This tells us that most of the segment boundaries discovered are
extremely robust. As shown in Figure 3, these robust segment boundaries agree
very well with the dates of important market events. In Table 1, we also show
the intervals where the segmentations from the two models disagree. These
intervals are bounded by very robust segment boundaries, and most of these intervals
correspond to highly volatile periods in the DJI time series. Within these intervals,
disagreement between the two models is primarily in the form of different numbers
of segment boundaries. We surmise that the statistical fluctuations within these
intervals are highly nonstationary, and thus not well described by a collection of
stationary models. Even so, we find many common segment boundaries within
these intervals.
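
The counting of common boundaries can be sketched as follows. This is an illustrative Python sketch under stated assumptions: boundaries are represented as half-hour indices, one trading day is taken to be 13 half-hours (a 6.5-hour session, which is an assumption of this example and not a figure stated in the text), and boundaries are paired greedily in time order. The function name and the sample boundary positions are hypothetical.

```python
def common_boundaries(a, b, tolerance):
    """Pair up boundaries from two segmentations that lie within `tolerance`."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    pairs = []
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            pairs.append((a[i], b[j]))  # close enough: count as common
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] has no partner within tolerance
        else:
            j += 1  # b[j] has no partner within tolerance
    return pairs

HALF_HOURS_PER_DAY = 13  # assumed length of one trading day in half-hours

# Hypothetical boundary positions (half-hour indices) from the two models.
normal_model = [100, 250, 400, 900]
log_normal_model = [100, 260, 600, 905]

pairs = common_boundaries(normal_model, log_normal_model, HALF_HOURS_PER_DAY)
exact = [p for p in pairs if p[0] == p[1]]
print(len(pairs), len(exact))
```

Greedy pairing is adequate when boundaries are well separated compared with the tolerance; where several boundaries crowd within a day of each other, a more careful matching might pair them differently.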

3.2. Statistical Clustering 

In their classic studies [5, 6], Goldfeld et al and Hamilton assumed only two
macroeconomic phases for the US GNP. More recently, Sims and Zha assumed
four phases in their analysis of the history of US monetary policy [31]. In
general, economists believe in the existence of only a small number of
macroeconomic phases. On the large scale, the textbook economic cycle consists of
recurrent switches between an economic expansion phase and an economic contraction
phase. On a smaller scale, economists also acknowledge the existence of a
market correction phase and a market crash phase. Based on our clustering analysis
of the segments, we find indeed a small number of clusters, as shown in Figure 1.
For the normal index movement model, we find between five and seven
clusters of segments, depending on the level of granularity we choose. Similarly, the
hierarchical clustering tree of the log-index movement model suggests seven
clusters of segments. For both models, the coarsest description that is reasonable and
informative is in terms of three clusters of segments.

Table 1: Intervals within the January 1997 to August 2008 period where segmentations of the
normal index movement model and log-normal index movement model disagree.

                                 number of segments
start date      end date         normal index      log-normal index    common
                                 movement model    movement model      boundaries
Nov 3, 1997     Mar 31, 1998     4                 5
Aug 26, 1998    Oct 20, 1998     3                 2
Jan 13, 1999    Nov 5, 1999      3                 7
Mar 9, 2001     Jun 3, 2002      18                10                  6
Oct 16, 2002    Aug 6, 2003      9                 6                   2
Mar 10, 2004    Oct 18, 2005     3                 8                   1
Jul 28, 2006    Aug 15, 2006     1                 2
Sep 5, 2006     Dec 27, 2006     4                 1
Jul 25, 2007    Mar 10, 2008     7                 14                  4

When we plot a scatter diagram of the segment means and standard
deviations, as shown in Figure 2, we see that the clusters are distinguished primarily
through their standard deviations, i.e. their market volatilities. Adopting a
heat-map-like colour scheme for the clusters, we colour the low-volatility clusters deep
blue and blue, the moderate-volatility clusters cyan and green, the high-volatility
clusters yellow and orange, and the extremely-high-volatility clusters red. Using
this colour scheme, we plot the temporal distributions of clustered segments for
the two models as Figure 3. The two temporal distributions agree qualitatively
on the existence of a low-volatility phase between mid-2003 and end-2006, and a
high-volatility phase within 2008. However, we find the log-normal index
movement model exaggerates small statistical divergences, at the same time playing
down large statistical divergences. As such, there is higher temporal contrast at
low market volatilities, and lower temporal contrast at high market volatilities. In
comparison, the normal index movement model, with its uniform contrast between
market volatilities, tells us a much more interesting story: over the period January
1997 to August 2008, the US market, as measured by the DJI, is found
predominantly in the low-volatility (deep blue and blue) and high-volatility (yellow and
orange) phases. By visual inspection of the DJI time series, we see that the
low-volatility phase has a natural interpretation as the economic expansion phase, but
while the high-volatility phase contains the economic contraction phase, its
duration is significantly longer. From this point on, we will limit our discussions to
the normal index movement model.

Figure 1: The complete-link hierarchical clustering trees of the segments obtained using the
normal index movement model (top) and the log-normal index movement model (bottom). The
differentiated clusters are coloured according to their market volatilities: low (deep blue and blue),
moderate (cyan and green), high (yellow and orange), and extremely high (red). Also shown at the
major branches are the Jensen-Shannon divergence values at which subclusters are merged.

Figure 2: Means and standard deviations of the segments obtained using the normal index
movement model (left) and the log-normal index movement model (right). As we can see, the clusters
are differentiated primarily by their standard deviations.

As we can see from Figure 3, both the low-volatility phase and the
high-volatility phase are interrupted by a moderate-volatility market correction phase
(green). In the normal index movement model, segments within this phase have
very consistent standard deviations of about 20 index points. The length
distribution of these market correction segments, however, is bimodal, with one group
lasting between 100-200 half-hours (1-2 weeks), and another group lasting
between 700-900 half-hours (1.5-2 months). In general, we find more short
correction segments within the low-volatility phase, and more long correction segments
within the high-volatility phase. The high-volatility phase is also interrupted
frequently by an extremely-high-volatility market crash phase, which sports a broad
range of standard deviations from 50 to 150 index points. Crash segment lengths
were also found to fall into three groups: between 10-40 half-hours (1-3 days),
around 100 half-hours (1 week), and between 200-300 half-hours (2-3 weeks).

3.3. Temporal Distribution of Clustered Segments

Most importantly, the temporal distribution of the clustered segments between
January 1997 and August 2008 indicates the US market made a transition from
the low-volatility phase to the high-volatility phase in mid-1998, went back to the
low-volatility phase in mid-2003, and again switched back to the high-volatility
phase in mid-2007. The first high-volatility phase observed in this period lasted
five years, within which we find not only the official March-November 2001
recession, but also the 2000 high in the DJI. It is generally believed that the DJI
2000 high is the result of the Dot-Com Bubble, even though the March 2000
NASDAQ Crash did not even register on the DJI. Very interestingly, apart from
more or less isolated market corrections, we find a series of market corrections
which get more and more severe prior to the mid-1998 phase transition. We
realized that these are precursor shocks similar in nature to those found by Sornette et
al preceding market crashes [32, 33, 34, 35]. From Figure 3, we see that the first
precursor shock appeared right after the July 1997 Asian Financial Crisis. This
suggests, at least on face value, that the mid-1998 transition was triggered by the
Asian Financial Crisis. Looking at the end of this first high-volatility phase, we
find a series of inverted shocks, starting shortly after the DJI 2002 low. Just like
the precursor shocks preceding the low-to-high transition, these low- to
moderate-volatility inverted shocks went on for about a year before the US market made the
high-to-low phase transition. Though we do not yet understand the nature of these
shocks and inverted shocks, it is likely that they are generic features in the
dynamics of stock markets.

Figure 3: Temporal distributions of the clustered segments for the normal index movement model
(top), and the log-normal index movement model (bottom). The red solid lines indicate the dates
of important market events: (1) July 1997 Asian Financial Crisis; (2) October 1997 Mini Crash;
(3) August 1998 Russian Financial Crisis; (4) DJI 2000 High; (5) NASDAQ Crash; (6) start of
2001 recession; (7) Sep 11 Attack; (8) end of 2001 recession; (9) DJI 2002 Low; (10) February
2007 Chinese Correction.

The second high-volatility phase observed in the DJI time series is none other 
than the present global financial crisis. Depending on the sources, the Subprime 
Crisis, which catalyzed the current global financial crisis, is dated as early as July 
2007. On the surface, there seems to be no connection between this gradual down- 
turn, and the Feb 2007 market crash known as the Chinese Correction. However, 
we find the Chinese Correction sitting in the middle of a year-long precursor shock 
period starting in May 2006, marked by a less severe market event that also had 
to do with corrections in the Chinese markets. Again, on face value, the US fi- 
nancial crisis appears to be triggered by structural upheavals in a foreign market. 
However, given that the US has substantial investment interests in China, it is not
clear from our observations what the true causes and effects might be. Between 
September 2008 and April 2009, we have yet to detect any inverted shocks, al- 
though it is likely the DJI has seen its lowest point of this crisis in March 2009. In 
the most optimistic scenario that we start finding inverted shocks in April or May 
2009, and assuming the fundamental dynamics underlying these entities have not 
changed from the previous crisis to the present crisis, we can expect the US market 
to complete the high-to-low phase transition (effectively an economic recovery) in 
mid-2010. 

Finally, after learning so much from the DJI time series, it is natural to ask if it
is possible to avert an impending financial crisis, if early detection based on pre- 
cursor shocks is reliable. To answer such a question, we will need to understand 
the interplay between factors that caused the precursor shocks. At the very worst, 
if we cannot understand the nature of these precursor shocks, they would remain 
useful as early warning indicators of the financial crisis. Our hope then would be 
that intervention measures meted out early may be able to soften the crisis, and 
perhaps even shorten it. Equally important, if we can understand what we did 
in the previous crisis that culminated in the inverted shocks, we might be able to 
develop more systematic measures to aid recovery from the current crisis. 

4. Conclusions 

We performed statistical segmentation of the DJI time series between
January 1997 and August 2008, using an optimized recursive segmentation scheme
derived from that introduced by Bernaola-Galvan et al. We assumed normal as
well as log-normal index movements in each unknown statistically stationary
segment of the time series, and used the Jensen-Shannon divergence as the
statistical distance between segments. Adopting the termination heuristic described in
Section 2, we found 116 segments for the normal index movement model, and
119 for the log-normal index movement model. These two segmentations agree
very well with each other, suggesting that the segment boundaries discovered are
statistically robust. We then performed agglomerative hierarchical clustering of
the segments using the complete-link algorithm, to find that the large number of
segments can be assigned to between five and seven clusters. These clusters are
distinguished primarily by their variances, and represent low-volatility,
moderate-volatility, high-volatility, and extremely-high-volatility macroeconomic phases.

Based on the temporal distribution of the clustered segments, we saw that
the US economy, as measured by the DJI, is found predominantly in the
low-volatility phase or the high-volatility phase. The low-volatility phase corresponds
very roughly to the economic expansion phase of the standard economic cycle.
In contrast, the accepted economic contraction phase is completely nested within
the much longer high-volatility phase. Both phases are interrupted frequently
by the week-long or month-long moderate-volatility market correction phases.
Market crashes, which form a distinct macroeconomic phase with extremely high
volatility, occur with durations ranging from one day to three weeks, and are almost
exclusively found within the high-volatility phase. Within the period studied, we
found the high-volatility phase occurring only twice. The first such interval was from
mid-1998 to mid-2003. The second interval is the ongoing global financial crisis
which, according to our results, started in mid-2007.

From the temporal distribution of clustered segments, we found a series of
moderate-volatility precursor shocks preceding the mid-1998 low-to-high phase
transition, and also a series of moderate-volatility inverted shocks preceding the
mid-2003 high-to-low phase transition, which is associated with the economic
recovery that started with the DJI 2002 low. There is also a series of precursor shocks
preceding the mid-2007 low-to-high phase transition that brought many financial
giants around the world to their knees. The time scale for all transitions identified
from the DJI time series is about a year. We suspect inverted shocks would again
appear roughly a year before the end of the current financial crisis. The
implication of this finding is that, if we do find inverted shocks trailing the March
2009 low, and take these as the start of the economic recovery, the US economy
might find itself back in the low-volatility phase sometime in the middle of 2010.
Should this optimistic scenario pan out, the current high-volatility global financial
crisis would have lasted about three years, compared to five years for the previous
high-volatility phase.

From the DJI time series data alone, we see at face value that the mid-1998
transition was triggered by the July 1997 Asian Financial Crisis. This assessment
runs contrary to most accounts, because the US market actually went on to scale
new heights in 2000. However, because of the high volatility between 1998 and
2000, the upward trend within this period must be interpreted very carefully. In
comparison, the local trend between 2004 and 2007 is statistically much more
significant, because of the low volatility within this period. While the February 2007
market crash known as the Chinese Correction might have played an important
role, we see that there are earlier signs for the start of global economic decline
in mid-2007. This is an unnamed market event in May 2006, also related to
correction within the Chinese markets. All in all, the story that unfolded from our
analysis of the DJI time series tells us how the global economies are so coupled to
each other, that structural transitions in one market eventually propagate to most
markets around the world.

Presently, we have initiated a comparative study of the Nikkei 225 and the DJI
over the same period (January 1997 to August 2008), to see whether there are
statistical signatures that point to causal links between the US and Japanese markets.
At the same time, we are replicating the analysis for the Dow Jones family of US
economic sector time series, to search for causal links between different economic
sectors. We hope this more extensive analysis will tell us which economic sectors
follow which other economic sectors into decline during a financial crisis. We
also hope to see which economic sectors lead the economic recovery, and which
economic sectors are lifted up by others as the economy recovers. Ultimately,
a better understanding of the causal relationships between economic sectors will
hint at more effective, and less costly stimulus measures.